Sliding to predict: vision-based beating heart motion estimation by modeling temporal interactions.
PURPOSE: Technical advancements have been part of modern medical solutions as they promote better surgical alternatives that benefit patients. Particularly in cardiovascular surgery, robotic surgical systems enable surgeons to perform delicate procedures on a beating heart, avoiding the complications of cardiac arrest. This advantage comes at the price of having to deal with a dynamic target, which presents technical challenges for the surgical system. In this work, we propose a solution for cardiac motion estimation. METHODS: Our estimation approach uses a variational framework that guarantees preservation of the complex anatomy of the heart. An advantage of our approach is that it takes into account different disturbances, such as specular reflections and occlusion events. This is achieved by performing a preprocessing step that eliminates the specular highlights and a prediction step, based on a conditional restricted Boltzmann machine, that recovers missing information caused by partial occlusions. RESULTS: We carried out exhaustive experiments on two datasets, one from a phantom and the other from an in vivo procedure. The results show that our visual approach reaches an average minimum in the order of magnitude of [Formula: see text] while preserving the heart's anatomical structure and providing stable values for the Jacobian determinant ranging from 0.917 to 1.015. We also show that our specular elimination approach reaches an accuracy of 99% compared to a ground truth. In terms of prediction, our approach compared favorably against two well-known predictors, NARX and EKF, giving the lowest average RMSE of 0.071. CONCLUSION: Our approach avoids the risks of using mechanical stabilizers and can also be effective for acquiring the motion of organs other than the heart, such as the lung or other deformable objects.
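The abstract above reports Jacobian determinant values close to 1 as evidence that the estimated deformation preserves anatomy. A minimal sketch of how such a check could be computed is given here, assuming a 2D deformation of the form phi(p) = p + u(p). The function name `jacobian_determinant`, the callable `u`, and the step `h` are hypothetical and are not taken from the paper.

```python
def jacobian_determinant(u, x, y, h=1e-5):
    """Return det(I + grad u) at (x, y) for the deformation phi(p) = p + u(p).

    `u` is a hypothetical callable mapping (x, y) to a displacement (ux, uy).
    Derivatives are approximated with central differences of step `h`.
    """
    ux_p, uy_p = u(x + h, y)
    ux_m, uy_m = u(x - h, y)
    dux_dx = (ux_p - ux_m) / (2 * h)
    duy_dx = (uy_p - uy_m) / (2 * h)

    ux_p, uy_p = u(x, y + h)
    ux_m, uy_m = u(x, y - h)
    dux_dy = (ux_p - ux_m) / (2 * h)
    duy_dy = (uy_p - uy_m) / (2 * h)

    # Determinant of [[1 + dux_dx, dux_dy], [duy_dx, 1 + duy_dy]].
    return (1 + dux_dx) * (1 + duy_dy) - dux_dy * duy_dx


# Illustrative displacement fields (assumptions, not data from the paper).
identity = lambda x, y: (0.0, 0.0)             # no motion
expansion = lambda x, y: (0.1 * x, 0.1 * y)    # uniform 10% stretch
fold = lambda x, y: (-2.0 * x, 0.0)            # flips orientation
```

A value of 1 means local area is preserved, values above or below 1 mean local expansion or compression, and a negative value signals a fold, which is the kind of anatomically implausible deformation a reported range such as 0.917 to 1.015 rules out.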
LaplaceNet: A Hybrid Energy-Neural Model for Deep Semi-Supervised Classification
Semi-supervised learning has received a lot of recent attention as it
alleviates the need for large amounts of labelled data which can often be
expensive, require expert knowledge and be time-consuming to collect. Recent
developments in deep semi-supervised classification have reached unprecedented
performance and the gap between supervised and semi-supervised learning is
ever-decreasing. This improvement in performance has been based on the
inclusion of numerous technical tricks, strong augmentation techniques and
costly optimisation schemes with multi-term loss functions. We propose a new
framework, LaplaceNet, for deep semi-supervised classification that has a
greatly reduced model complexity. We utilise a hybrid energy-neural network
where graph based pseudo-labels, generated by minimising the graphical
Laplacian, are used to iteratively improve a neural-network backbone. Our model
outperforms state-of-the-art methods for deep semi-supervised classification,
over several benchmark datasets. Furthermore, we consider the application of
strong-augmentations to neural networks theoretically and justify the use of a
multi-sampling approach for semi-supervised learning. We demonstrate, through
rigorous experimentation, that a multi-sampling augmentation approach improves
generalisation and reduces the sensitivity of the network to augmentation
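The graph-based pseudo-labelling idea in the abstract above can be illustrated with classic label propagation, which minimises a graph-Laplacian energy by iterating F <- alpha * S * F + (1 - alpha) * Y with S = D^(-1/2) W D^(-1/2). This is a minimal sketch of that well-known scheme, not the LaplaceNet implementation; the function name `propagate_labels`, the toy graph, and the value of `alpha` are assumptions.

```python
import math


def propagate_labels(W, Y, alpha=0.9, iters=200):
    """Return one pseudo-label per node from classic label propagation.

    W: symmetric non-negative affinity matrix (list of lists), no isolated nodes.
    Y: one-hot rows for labelled nodes, all-zero rows for unlabelled nodes.
    """
    n, c = len(W), len(Y[0])
    deg = [sum(row) for row in W]
    # Symmetrically normalised affinity S = D^(-1/2) W D^(-1/2).
    S = [[W[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
         for i in range(n)]
    F = [row[:] for row in Y]
    for _ in range(iters):
        F = [[alpha * sum(S[i][k] * F[k][j] for k in range(n))
              + (1 - alpha) * Y[i][j]
              for j in range(c)]
             for i in range(n)]
    # Pseudo-label = class with the largest propagated score.
    return [max(range(c), key=lambda j: F[i][j]) for i in range(n)]


# Toy graph (an assumption): two triangles joined by one weak edge (2-3).
W = [
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
]
# Only node 0 (class 0) and node 5 (class 1) carry labels.
Y = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0], [0.0, 1.0]]
pseudo_labels = propagate_labels(W, Y)
```

In a hybrid scheme of the kind the abstract describes, pseudo-labels like these would then serve as training targets for the network, whose updated features would in turn define a new graph.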
Contrastive Registration for Unsupervised Medical Image Segmentation
Medical image segmentation is a relevant task as it serves as the first step
for several diagnosis processes, thus it is indispensable in clinical usage.
Whilst major success has been reported using supervised techniques, they assume
a large and well-representative labelled set. This is a strong assumption in
the medical domain where annotations are expensive, time-consuming, and
prone to human bias. To address this problem, unsupervised techniques have
been proposed in the literature yet it is still an open problem due to the
difficulty of learning any transformation pattern. In this work, we present a
novel optimisation model framed into a new CNN-based contrastive registration
architecture for unsupervised medical image segmentation. The core of our
approach is to exploit image-level registration and feature-level contrastive
learning to perform registration-based segmentation.
Firstly, we propose an architecture to capture the image-to-image
transformation pattern via registration for unsupervised medical image
segmentation. Secondly, we embed a contrastive learning mechanism into the
registration architecture to enhance the discriminating capacity of the network
at the feature level. We show that our proposed technique mitigates the major
drawbacks of existing unsupervised techniques. We demonstrate, through
numerical and visual experiments, that our technique substantially outperforms
the current state-of-the-art unsupervised segmentation methods on two major
medical image datasets.
Comment: 11 pages, 3 figures
NorMatch: Matching Normalizing Flows with Discriminative Classifiers for Semi-Supervised Learning
Semi-Supervised Learning (SSL) aims to learn a model using a tiny labeled set
and massive amounts of unlabeled data. To better exploit the unlabeled data the
latest SSL methods use pseudo-labels predicted from a single discriminative
classifier. However, the generated pseudo-labels are inevitably linked to
inherent confirmation bias and noise which greatly affects the model
performance. In this work we introduce a new framework for SSL named NorMatch.
Firstly, we introduce a new uncertainty estimation scheme based on normalizing
flows, as an auxiliary classifier, to enforce highly certain pseudo-labels
yielding a boost of the discriminative classifiers. Secondly, we introduce a
threshold-free sample weighting strategy to better exploit both high and low
confidence pseudo-labels. Furthermore, we utilize normalizing flows to model,
in an unsupervised fashion, the distribution of unlabeled data. This modelling
assumption can further improve the performance of generative classifiers via
unlabeled data, thus implicitly contributing to training a better
discriminative classifier. We demonstrate, through numerical and visual
results, that NorMatch achieves state-of-the-art performance on several
datasets.
Comment: Accepted to Transactions on Machine Learning Research
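Normalizing flows, which the abstract above uses as an auxiliary generative classifier, rest on the change-of-variables formula log p(x) = log p_z(f(x)) + log |df/dx|. The sketch below shows that formula for a one-dimensional affine flow and a toy class-conditional classifier built on it. It is an illustration under stated assumptions, not NorMatch's architecture; `affine_flow_logpdf`, `classify`, and the per-class parameters are hypothetical.

```python
import math


def standard_normal_logpdf(z):
    """Log-density of the base distribution N(0, 1)."""
    return -0.5 * (z * z + math.log(2 * math.pi))


def affine_flow_logpdf(x, shift, log_scale):
    """log p(x) for the flow x = shift + exp(log_scale) * z with z ~ N(0, 1)."""
    z = (x - shift) * math.exp(-log_scale)  # inverse map f(x)
    log_det = -log_scale                    # log |dz/dx|
    return standard_normal_logpdf(z) + log_det


def classify(x, class_flows):
    """Pick the class whose flow assigns x the highest log-likelihood.

    `class_flows` maps a class id to hypothetical (shift, log_scale) values
    and assumes equal class priors.
    """
    return max(class_flows,
               key=lambda c: affine_flow_logpdf(x, *class_flows[c]))


# Hypothetical per-class flow parameters.
class_flows = {0: (-2.0, 0.0), 1: (2.0, 0.0)}
predicted = classify(1.5, class_flows)
```

Practical flows stack many invertible layers with learned parameters, but each layer contributes a log-determinant term to the likelihood in the same way.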
Multi-Modal Hypergraph Diffusion Network with Dual Prior for Alzheimer Classification
The automatic early diagnosis of prodromal stages of Alzheimer's disease is
of great relevance for patient treatment to improve quality of life. We address
this problem as a multi-modal classification task. Multi-modal data provides
richer and complementary information. However, existing techniques only
consider either lower-order relations between the data or single/multi-modal
imaging data. In this work, we introduce a novel semi-supervised hypergraph
learning framework for Alzheimer's disease diagnosis. Our framework allows for
higher-order relations among multi-modal imaging and non-imaging data whilst
requiring a tiny labelled set. Firstly, we introduce a dual embedding strategy
for constructing a robust hypergraph that preserves the data semantics. We
achieve this by enforcing perturbation invariance at the image and graph levels
using a contrastive-based mechanism. Secondly, we present a dynamically
adjusted hypergraph diffusion model, via a semi-explicit flow, to improve the
predictive uncertainty. We demonstrate, through our experiments, that our
framework is able to outperform current techniques for Alzheimer's disease
diagnosis.
Video Adverse-Weather-Component Suppression Network via Weather Messenger and Adversarial Backpropagation
Although convolutional neural networks (CNNs) have been proposed to remove
adverse weather conditions in single images using a single set of pre-trained
weights, they fail to restore weather videos due to the absence of temporal
information. Furthermore, existing methods for removing adverse weather
conditions (e.g., rain, fog, and snow) from videos can only handle one type of
adverse weather. In this work, we propose the first framework for restoring
videos from all adverse weather conditions by developing a video
adverse-weather-component suppression network (ViWS-Net). To achieve this, we
first devise a weather-agnostic video transformer encoder with multiple
transformer stages. Moreover, we design a long short-term temporal modeling
mechanism for weather messenger to early fuse input adjacent video frames and
learn weather-specific information. We further introduce a weather
discriminator with gradient reversion, to maintain the weather-invariant common
information and suppress the weather-specific information in pixel features, by
adversarially predicting weather types. Finally, we develop a messenger-driven
video transformer decoder to retrieve the residual weather-specific feature,
which is spatiotemporally aggregated with hierarchical pixel features and
refined to predict the clean target frame of input videos. Experimental
results, on benchmark datasets and real-world weather videos, demonstrate that
our ViWS-Net outperforms current state-of-the-art methods in terms of restoring
videos degraded by any weather condition.
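The weather discriminator in the abstract above is trained "with gradient reversion". Assuming this refers to the standard gradient reversal layer from domain-adversarial training, the mechanism can be sketched as a layer that is the identity in the forward pass and negates (and optionally scales) the gradient in the backward pass. The class below uses a hand-written backward pass instead of an autograd framework; the name `GradientReversal` and the scale `lam` are assumptions, not details from ViWS-Net.

```python
class GradientReversal:
    """Identity in the forward pass, negated gradient in the backward pass."""

    def __init__(self, lam=1.0):
        # lam scales the reversed gradient (hypothetical default).
        self.lam = lam

    def forward(self, features):
        # Features reach the discriminator unchanged.
        return list(features)

    def backward(self, grad_from_discriminator):
        # The encoder receives the opposite of the discriminator's gradient,
        # so a step that helps the discriminator hurts the encoder's ability
        # to expose weather-specific cues, and vice versa.
        return [-self.lam * g for g in grad_from_discriminator]


layer = GradientReversal(lam=0.5)
features_out = layer.forward([1.0, -2.0])
grad_to_encoder = layer.backward([0.2, -0.4])
```

Placed between the feature extractor and the weather-type classifier, such a layer lets a single backward pass train the classifier to predict the weather type while pushing the features towards being weather-invariant.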
Why Deep Surgical Models Fail?: Revisiting Surgical Action Triplet Recognition through the Lens of Robustness
Surgical action triplet recognition provides a better understanding of the
surgical scene. This task is of high relevance as it provides the surgeon
with context-aware support and safety. The current go-to strategy for improving
performance is the development of new network mechanisms. However, the
performance of current state-of-the-art techniques is substantially lower than
other surgical tasks. Why is this happening? This is the question that we
address in this work. We present the first study to understand the failure of
existing deep learning models through the lens of robustness and explainability.
Firstly, we study existing models under weak and strong
perturbations via an adversarial optimisation scheme. We then provide the
failure modes via feature-based explanations. Our study reveals that the key to
improving performance and increasing reliability lies in the core and spurious
attributes. Our work opens the door to more trustworthy and reliable
deep learning models in surgical science.
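The abstract above does not specify its adversarial optimisation scheme. As a stand-in, the sketch below uses the fast gradient sign method (FGSM) on a tiny logistic model to show what weak and strong perturbations of an input look like. The model, its weights, and the two epsilon values are assumptions chosen for illustration only.

```python
import math


def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))


def loss_and_input_grad(w, b, x, y):
    """Binary cross-entropy of a logistic model and its gradient w.r.t. x."""
    p = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]
    return loss, grad


def fgsm(w, b, x, y, eps):
    """Move x by eps in the sign direction of the loss gradient."""
    _, grad = loss_and_input_grad(w, b, x, y)
    sign = lambda g: (g > 0) - (g < 0)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Hypothetical model and input.
w, b = [1.0, -2.0], 0.1
x, y = [0.5, 0.2], 1

clean_loss, _ = loss_and_input_grad(w, b, x, y)
weak_loss, _ = loss_and_input_grad(w, b, fgsm(w, b, x, y, eps=0.01), y)
strong_loss, _ = loss_and_input_grad(w, b, fgsm(w, b, x, y, eps=0.5), y)
```

Comparing a model's loss or accuracy across increasing perturbation budgets is one common way to quantify robustness; a steep increase for small budgets suggests the model leans on fragile features.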